

WHAT IS CLAIMED IS: 

1. An information retrieval system for retrieving information
a user seeks from a plurality of documents, comprising:

document storage means for storing the plurality of
documents;

feature amount extraction means for extracting a feature
amount of each of the plurality of documents stored in the
document storage means;

clustering means for classifying the plurality of documents
stored in the document storage means into a plurality
of clusters based on the extracted feature amounts so that
each cluster includes one document or a plurality of documents
having feature amounts similar to each other;

document retrieval means for retrieving a document satisfying
a retrieval condition input by the user among the
plurality of documents stored in the document storage means;
and

interface means for presenting the retrieved document
together with the rest of documents included in a cluster to
which the retrieved document belongs if the cluster includes
a plurality of documents, as retrieval results.
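
As an illustration only, the presentation step of the interface means might be sketched as follows. The function name `expand_with_cluster` and the `cluster_of` mapping (document id to cluster id) are hypothetical and are not taken from the claims; the clustering itself is assumed to have been done already.

```python
from collections import defaultdict

def expand_with_cluster(hit_ids, cluster_of):
    """Pair each retrieved document with the other members of its cluster.

    hit_ids:    ids of documents that satisfied the retrieval condition
    cluster_of: hypothetical mapping of document id -> cluster id
    """
    # Group all stored documents by cluster.
    members = defaultdict(list)
    for doc_id, cluster_id in cluster_of.items():
        members[cluster_id].append(doc_id)

    results = []
    for doc_id in hit_ids:
        # "The rest of documents" in the same cluster; empty for a singleton.
        siblings = [d for d in members[cluster_of[doc_id]] if d != doc_id]
        results.append((doc_id, siblings))
    return results

cluster_of = {"d1": 0, "d2": 0, "d3": 1, "d4": 2, "d5": 2}
print(expand_with_cluster(["d3", "d4"], cluster_of))
```

In this sketch a document in a single-document cluster is presented alone, which matches the condition "if the cluster includes a plurality of documents".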

2. The information retrieval system of Claim 1, wherein,
from each of the plurality of documents stored in the document
storage means, the feature amount extraction means extracts,
as the feature amount, a feature vector including as
elements a pair of a term appearing in the document and a
weight with which the term characterizes the document.
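
A minimal sketch of such a feature vector, assuming TF-IDF as the weighting scheme and whitespace tokenization. Both are assumptions for illustration; the claim does not prescribe how the weight is computed, and `feature_vectors` is a hypothetical name.

```python
import math
from collections import Counter

def feature_vectors(docs):
    """Return, per document, a list of (term, weight) pairs.

    The weight is TF-IDF here, which is an assumed choice.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # A term that appears in every document gets weight 0.
        vectors.append([(t, tf[t] * math.log(n / df[t])) for t in sorted(tf)])
    return vectors

for vec in feature_vectors(["printer jam", "printer offline"]):
    print(vec)
```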

3. The information retrieval system of Claim 1, wherein
the clustering means adopts a way of clustering that provides
the largest number of clusters each including a plurality of
documents.
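
One way to read this selection criterion, sketched under the assumption that several candidate clusterings (for example, produced with different thresholds) are already available as document-to-cluster mappings. The helper names are hypothetical.

```python
from collections import Counter

def count_multi_document_clusters(assignment):
    """Count clusters that contain two or more documents."""
    sizes = Counter(assignment.values())
    return sum(1 for size in sizes.values() if size >= 2)

def pick_clustering(candidates):
    """Choose the candidate with the most multi-document clusters."""
    return max(candidates, key=count_multi_document_clusters)

candidates = [
    {"a": 0, "b": 0, "c": 0, "d": 0},  # one large cluster
    {"a": 0, "b": 0, "c": 1, "d": 1},  # two clusters of two documents
    {"a": 0, "b": 1, "c": 2, "d": 3},  # four singletons
]
print(pick_clustering(candidates))
```

Under this criterion both extremes (everything in one cluster, or every document alone) score poorly, so the middle candidate is chosen.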

4. The information retrieval system of Claim 1, further
comprising cluster label preparation means for preparing a
plurality of cluster labels respectively representing the
contents of the plurality of clusters,

wherein the interface means presents a cluster label
representing the contents of the cluster to which the retrieved
document belongs among the plurality of cluster labels
prepared, together with the retrieval results.

5. The information retrieval system of Claim 4, wherein,
for each of the plurality of clusters, the cluster label
preparation means selects one or a plurality of terms
characterizing the cluster from all documents belonging to the
cluster as the cluster label.
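
A sketch of term-based cluster labelling, assuming each document in the cluster is represented as a term-to-weight mapping and that the characterizing terms are those with the largest summed weight. That ranking rule, the parameter `k`, and the name `cluster_label` are assumptions, not part of the claim.

```python
from collections import Counter

def cluster_label(cluster_vectors, k=2):
    """Pick the k terms with the highest total weight in the cluster.

    cluster_vectors: one {term: weight} dict per document in the cluster.
    """
    totals = Counter()
    for vec in cluster_vectors:
        totals.update(vec)  # adds the weights term by term
    # Sort by descending weight, then alphabetically to break ties.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

vectors = [
    {"toner": 2.0, "printer": 0.5},
    {"toner": 1.0, "cartridge": 1.5},
]
print(cluster_label(vectors))
```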

6. The information retrieval system of Claim 4, wherein,
for each of the plurality of clusters, the cluster label
preparation means selects one sentence characterizing the
cluster from all documents belonging to the cluster as the
cluster label.



7. The information retrieval system of Claim 4, further
comprising document label preparation means for preparing a
plurality of document labels respectively representing the
contents of the plurality of documents stored in the document
storage means,

wherein the interface means presents document labels
representing the contents of the documents included in the
cluster to which the retrieved document belongs among the
plurality of document labels prepared, together with the
retrieval results.

8. The information retrieval system of Claim 7, wherein,
for each of the plurality of documents stored in the document
storage means, the document label preparation means selects
one sentence characterizing the document from all sentences
in the document as the document label.
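
For illustration, sentence selection could be sketched as below. The scoring rule (average in-document term frequency of the words in a sentence) is an assumed heuristic, and `document_label` is a hypothetical name; the claim leaves open how a sentence is judged to characterize the document.

```python
import re
from collections import Counter

def document_label(text):
    """Return the sentence whose words are, on average, most frequent
    in the document (an assumed scoring heuristic)."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    tf = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence):
        words = re.findall(r"\w+", sentence.lower())
        return sum(tf[w] for w in words) / len(words) if words else 0.0

    return max(sentences, key=score)

text = "The tray is empty. Refill the paper tray and close the paper tray. Thanks."
print(document_label(text))
```

A practical system would likely ignore very common words; this sketch omits that to stay short.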

9. The information retrieval system of Claim 1, wherein
the plurality of documents includes a plurality of question
documents and a plurality of answer documents associated with
each other,

the retrieval condition is a natural language user question,

the feature amount extraction means extracts a feature
amount of each of the plurality of answer documents stored in
the document storage means to enable the plurality of answer
documents to be classified into a plurality of clusters by
the clustering means,

the information retrieval system further comprises similarity
operation means for calculating similarity between
each of the plurality of question documents stored in the
document storage means and a document of the user question,

the document retrieval means retrieves a question document
having high similarity among the plurality of question
documents stored in the document storage means based on the
calculated similarity, and retrieves an answer document
associated with the retrieved question document among the
plurality of answer documents stored in the document storage
means, and

the interface means presents the retrieved answer document
together with the rest of answer documents included in a
cluster to which the retrieved answer document belongs if the
cluster is composed of a plurality of answer documents, as
the retrieval results.
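
The question-to-answer flow of this claim might be sketched as follows, assuming cosine similarity over bag-of-words counts as the similarity operation and taking only the single most similar question. The similarity measure, the data layout, and the names `cosine` and `retrieve_answers` are all assumptions for illustration.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two texts using raw word counts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

def retrieve_answers(user_question, questions, answer_of, cluster_of):
    """questions:  question id -> question text
    answer_of:  question id -> associated answer id
    cluster_of: answer id -> cluster id (answers are what gets clustered)
    """
    # Retrieve the stored question most similar to the user question.
    best_q = max(questions, key=lambda q: cosine(questions[q], user_question))
    answer = answer_of[best_q]
    # Present the answer with the other answers in its cluster.
    siblings = [a for a, c in cluster_of.items()
                if c == cluster_of[answer] and a != answer]
    return answer, siblings

questions = {"q1": "how do I reset my password", "q2": "printer will not print"}
answer_of = {"q1": "a1", "q2": "a2"}
cluster_of = {"a1": 0, "a2": 1, "a3": 1}
print(retrieve_answers("my printer does not print", questions, answer_of, cluster_of))
```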

10. The information retrieval system of Claim 9, wherein
the interface means presents the retrieval results to the
user.

11. The information retrieval system of Claim 10,
wherein the interface means receives selection of an answer
document among the presented retrieval results by the user,
and

the information retrieval system further comprises document
updating means for retrieving a question document associated
with the selected answer document among the plurality
of question documents stored in the document storage means,
and, if the similarity between the retrieved question document
and the document of the user question is lower than a
predetermined value, newly storing the document of the user
question in the document storage means in association with
the selected answer document.
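
A sketch of this updating rule, assuming Jaccard word overlap as the similarity and a list of (question, answer id) pairs as the store. When several stored questions share the selected answer, this sketch compares against the most similar one. The measure, the default threshold of 0.5, and the names `jaccard` and `update_store` are assumptions.

```python
def jaccard(a, b):
    """Word-set overlap between two texts (an assumed similarity measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

def update_store(user_question, selected_answer, qa_pairs, threshold=0.5):
    """Store the user question with the selected answer when it differs
    enough from the questions already associated with that answer.

    qa_pairs: list of (question text, answer id), modified in place.
    Returns True when a new pair was stored.
    """
    linked = [q for q, a in qa_pairs if a == selected_answer]
    best = max((jaccard(q, user_question) for q in linked), default=0.0)
    if best < threshold:  # "lower than a predetermined value"
        qa_pairs.append((user_question, selected_answer))
        return True
    return False

store = [("how to reset password", "a1")]
print(update_store("forgot my login credentials", "a1", store), len(store))
print(update_store("how to reset password", "a1", store), len(store))
```

Storing only dissimilar questions lets the system learn new phrasings for an existing answer without accumulating near-duplicates.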

12. The information retrieval system of Claim 9, wherein
the interface means presents the retrieval results to an expert
together with the document of the user question, and
presents an answer document selected by the expert among the
presented retrieval results to the user.

13. The information retrieval system of Claim 12, further
comprising document updating means for retrieving a
question document associated with the selected answer document
among the plurality of question documents stored in the
document storage means, and, if the similarity between the
retrieved question document and the document of the user
question is lower than a predetermined value, newly storing
the document of the user question in the document storage
means in association with the selected answer document.

14. The information retrieval system of Claim 9, wherein
the interface means presents the retrieval results to an expert
together with the document of the user question, and
presents to the user a natural language answer document input
by the expert with reference to the presented retrieval
results.

15. The information retrieval system of Claim 14, further
comprising document updating means for newly storing the
document of the user question and the input answer document
in association with each other in the document storage means
if the similarity between each of the plurality of answer
documents stored in the document storage means and the input
answer document is lower than a predetermined value.

16. An information retrieval system for retrieving information
a user seeks from a plurality of documents, comprising:

document storage means for storing a plurality of question
documents and a plurality of answer documents associated
with each other;

similarity operation means for calculating similarity
between each of the plurality of question documents stored in
the document storage means and a document of a user question
when the user question in natural language is input by the
user;

document retrieval means for retrieving a plurality of
question documents having high similarity among the plurality
of question documents stored in the document storage means
based on the calculated similarity, and retrieving answer
documents associated with the respective retrieved question
documents among the plurality of answer documents stored in
the document storage means; and

interface means for presenting to an expert the plurality
of retrieved answer documents together with the document
of the user question, and presenting to the user an answer
document selected from the presented retrieval results by the
expert or a natural language answer document input by the expert
with reference to the presented retrieval results.

17. The information retrieval system of Claim 16, further
comprising document updating means for retrieving a
question document associated with the selected answer document
among the plurality of question documents stored in the
document storage means, and, if the similarity between the
retrieved question document and the document of the user
question is lower than a predetermined value, newly storing
the document of the user question in the document storage
means in association with the selected answer document.

18. The information retrieval system of Claim 16, further
comprising document updating means for newly storing the
document of the user question and the input answer document
in association with each other in the document storage means
if the similarity between each of the plurality of answer
documents stored in the document storage means and the input
answer document is lower than a predetermined value.